Description

SERINE PROTEASE POLYPEPTIDES AND
MATERIALS AND METHODS FOR MAKING THEM

BACKGROUND OF THE INVENTION 

Enzymes are used within a wide range of
applications in industry, research, and medicine. Through
the use of enzymes, industrial processes can be carried
out at reduced temperatures and pressures and with less
dependence on the use of corrosive or toxic substances.
The use of enzymes can thus reduce production costs,
energy consumption, and pollution as compared to
non-enzymatic products and processes.

An important group of enzymes is the proteases,
which cleave proteins. Industrial applications of
proteases include food processing, brewing, and alcohol
production. Proteases are important components of laundry
detergents and other products. Within biological
research, proteases are used in purification processes to
degrade unwanted proteins. It is often desirable to
employ proteases of low specificity or mixtures of more
specific proteases to obtain the necessary degree of
degradation.

Proteases are also key components of a broad
range of biological pathways, including blood coagulation
and digestion. For example, the absence or insufficiency
of a protease can result in a pathological condition that
can be treated by replacement or augmentation therapy.
Such therapies include the treatment of hemophilia with
clotting factors VIII, IX, and VIIa. In another
application, the proteolytic enzyme tissue plasminogen
activator (t-PA) is used to activate the body's clot
lysing mechanism, thereby reducing morbidity resulting
from myocardial infarction. The protease thrombin is used
to initiate the clotting of fibrinogen-based tissue
adhesives during surgery. Neutrophils produce several
antibacterial serine proteases (Gabay, Ciba Found. Symp.
186:237-247, 1994; Scocchi et al., Eur. J. Biochem.
209:589-595, 1992). Proteases also regulate cellular
processes through receptor-mediated pathways by
proteolytic activation of the cognate receptor (Vu et al.,
Cell 64:1057-1068, 1991; Blackhart et al., J. Biol. Chem.
271:16466-16471, 1996).

Overproduction or lack of regulation of
proteases can also have pathological consequences.
Elastase, released within the lung in response to the
presence of foreign particles, can damage lung tissue if
its activity is not tightly regulated. Emphysema in
smokers is believed to arise from an imbalance between
elastase and its inhibitor, alpha-1-antitrypsin. This
balance may be restored by administration of exogenous
alpha-1-antitrypsin.

One family of proteases of particular interest
is the serine proteases, which are characterized by a
catalytic triad of serine, histidine, and aspartic acid
residues. Serine proteases are used for a variety of
industrial purposes. For example, the serine protease
subtilisin is used in laundry detergents to aid in the
removal of proteinaceous stains (e.g., Crabb, ACS
Symposium Series 460:82-94, 1991). In the food processing
industry, serine proteases are used to produce
protein-rich concentrates from fish and livestock, and in the
preparation of dairy products (Kida et al., Journal of
Fermentation and Bioengineering 80:478-484, 1995; Haard
and Simpson, in Martin, A.M., ed., Fisheries Processing:
Biotechnological Applications, Chapman and Hall, London,
1994, 132-154; Bos et al., European Patent Office
Publication 494 149 A1).

In general, enzymes, including proteases, are
active over a narrow range of environmental conditions
(temperature, pH, etc.), and many are highly specific for
particular substrates. The narrow range of activity for a
given enzyme limits its applicability and creates a need
for a selection of enzymes that (a) have similar
activities but are active under different conditions or
(b) have different substrates. For instance, an enzyme
capable of catalyzing a reaction at 50°C may be so
inefficient at 35°C that its use at the lower temperature
will not be feasible. For this reason, laundry detergents
generally contain a selection of proteolytic enzymes,
allowing the detergent to be used over a broad range of
wash temperature and pH.

In view of the specificity of proteolytic
enzymes and the growing use of proteases in industry,
research, and medicine, there is an ongoing need in the
art for new enzymes and new enzyme inhibitors. The
present invention addresses these needs and provides
other, related advantages.

SUMMARY OF THE INVENTION

Within one aspect, the present invention
provides an isolated protein comprising a sequence of
amino acid residues that is at least 95% identical to SEQ
ID NO:2 from Ile, residue 111, through Asn, residue 373,
wherein the protein is a protease or protease precursor.
In one embodiment, the protein has from 263 to 398 amino
acid residues. In other embodiments, the protein
comprises residues 111 through 373 of SEQ ID NO:2 or SEQ
ID NO:15, residues 110 through 373 of SEQ ID NO:2 or SEQ
ID NO:15, or residues 1 through 373 of SEQ ID NO:2 or SEQ
ID NO:15. The protein can further comprise a heterologous
affinity tag or binding domain.

Within a second aspect, the invention provides
an isolated polynucleotide up to 1800 nucleotides in
length encoding a protein as disclosed above. Within one
embodiment, the polynucleotide is DNA. Within another
embodiment, the polynucleotide is double-stranded DNA.
Within a further embodiment, the protein encoded by the
polynucleotide comprises residues -19 through 373 of SEQ
ID NO:2.

Within a third aspect, the invention provides an
expression vector comprising the following operably linked
elements: (a) a transcription promoter; (b) a DNA segment
encoding a protein as disclosed above; and (c) a
transcription terminator. The expression vector can
further comprise a secretory signal sequence operably
linked to the DNA segment.

The invention also provides a cultured cell
containing an expression vector as disclosed above,
wherein the cell expresses the DNA segment. Within one
embodiment of the invention, the expression vector further
comprises a secretory signal sequence operably linked to
the DNA segment, and the cell secretes the protein.

There is also provided a method of making a
protease or protease precursor. The method comprises the
steps of (a) providing a host cell containing an
expression vector as disclosed above; (b) culturing the
host cell under conditions whereby the DNA segment is
expressed; and (c) recovering the protein encoded by the
DNA segment. Within one embodiment, the expression vector
further comprises a secretory signal sequence operably
linked to the DNA segment, the cell secretes the protein
into a culture medium, and the protein is recovered from
the medium.

Within a further aspect of the invention there
is provided a method of cleaving a peptide bond of a
substrate protein. The method comprises incubating the
substrate protein in the presence of a second protein
comprising a sequence of amino acid residues that is at
least 95% identical to SEQ ID NO:2 from Ile, residue 111,
through Asn, residue 373, whereby the peptide bond is
cleaved. Within one embodiment, the second protein is a
protease precursor and the method further comprises the
step of activating the second protein before the peptide
bond is cleaved.

The invention further provides a method of
detecting an inhibitor of proteolysis within a test sample
comprising the steps of (a) measuring proteolytic activity
of a protein as disclosed above in the presence of a test
sample to obtain a first value; (b) measuring proteolytic
activity of the protein in the absence of the test sample
to obtain a second value; and (c) comparing the first and
second values, whereby a higher second value relative to
the first value is indicative of an inhibitor of
proteolysis within the test sample.
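The comparison in step (c) can be illustrated with a minimal Python sketch. The function names, the percent-inhibition calculation, and the numerical readings are hypothetical and are not part of the disclosed method; both activity values are assumed to be expressed in the same arbitrary units.

```python
def indicates_inhibitor(first_value: float, second_value: float) -> bool:
    """Step (c): a higher second value (no test sample) relative to the
    first value (with test sample) indicates an inhibitor."""
    return second_value > first_value


def percent_inhibition(first_value: float, second_value: float) -> float:
    """Illustrative extra, not specified in the text: loss of activity
    as a percentage of the uninhibited (second) value."""
    if second_value <= 0:
        raise ValueError("second value must be positive")
    return 100.0 * (second_value - first_value) / second_value


# Hypothetical readings in arbitrary activity units.
with_sample, without_sample = 0.2, 0.8
print(indicates_inhibitor(with_sample, without_sample))
print(percent_inhibition(with_sample, without_sample))
```

In practice the two measurements would be replicated and compared against assay noise before any conclusion is drawn; that step is omitted here.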

The invention also provides an antibody that
specifically binds to a protein comprising a sequence of
amino acid residues that is at least 95% identical to SEQ
ID NO:2 from Ile, residue 111, through Asn, residue 373,
wherein the protein is a protease or protease precursor.

Within an additional aspect, the invention
provides a DNA construct encoding a polypeptide fusion.
The polypeptide fusion comprises, from amino terminus to
carboxyl terminus, amino acid residues -19 through -1 of
SEQ ID NO:2 operably linked to an additional polypeptide.

These and other aspects of the invention will
become evident upon reference to the following detailed
description of the invention.

DETAILED DESCRIPTION OF THE INVENTION 

Prior to setting forth the invention in detail,
certain terms used herein will be defined.

The term "allelic variant" denotes any of two or
more alternative forms of a gene occupying the same
chromosomal locus. Allelic variation arises naturally
through mutation, and may result in phenotypic
polymorphism within populations. Gene mutations can be
silent (no change in the encoded polypeptide) or may
encode polypeptides having altered amino acid sequence.
The term "allelic variant" is also used herein to denote a
protein encoded by an allelic variant of a gene.

The term "complements of polynucleotide
molecules" denotes polynucleotide molecules having a
complementary base sequence and reverse orientation as
compared to a reference sequence. For example, the
sequence 5' ATGCACGGG 3' is complementary to 5' CCCGTGCAT
3'.
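The relationship between the two example sequences can be checked with a short Python sketch. The function name and the restriction to the four unmodified DNA bases are assumptions made for illustration.

```python
# Watson-Crick pairing for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq: str) -> str:
    """Return the complementary strand of `seq`, written 5' to 3'.

    The input is read 5' to 3'; reversing it before pairing gives the
    "reverse orientation" described in the definition.
    """
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


print(reverse_complement("ATGCACGGG"))  # CCCGTGCAT
```

Applying the function twice returns the starting sequence, which is a convenient sanity check.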

The term "degenerate nucleotide sequence"
denotes a sequence of nucleotides that includes one or
more degenerate codons (as compared to a reference
polynucleotide molecule that encodes a polypeptide).
Degenerate codons contain different triplets of
nucleotides, but encode the same amino acid residue (e.g.,
GAU and GAC triplets each encode Asp).
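A minimal Python sketch of this definition follows. Only four entries of the standard genetic code are included (the two Asp codons named above and, for contrast, the two Glu codons); the table and function names are illustrative.

```python
# Partial codon table (RNA triplets) from the standard genetic code.
# Only the entries needed for this illustration are listed.
CODON_TABLE = {"GAU": "Asp", "GAC": "Asp", "GAA": "Glu", "GAG": "Glu"}


def are_degenerate(codon_a: str, codon_b: str) -> bool:
    """True if two different triplets encode the same amino acid."""
    a, b = codon_a.upper(), codon_b.upper()
    return a != b and CODON_TABLE[a] == CODON_TABLE[b]


print(are_degenerate("GAU", "GAC"))  # both encode Asp
print(are_degenerate("GAU", "GAA"))  # Asp versus Glu
```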

A "DNA construct" is a single or double
stranded, linear or circular DNA molecule that comprises
segments of DNA combined and juxtaposed in a manner not
found in nature. DNA constructs exist as a result of
human manipulation, and include clones and other copies of
manipulated molecules.

A "DNA segment" is a portion of a larger DNA
molecule having specified attributes. For example, a DNA
segment encoding a specified polypeptide is a portion of a
longer DNA molecule, such as a plasmid or plasmid
fragment, that, when read from the 5' to the 3' direction,
encodes the sequence of amino acids of the specified
polypeptide.

The term "expression vector" denotes a DNA
construct that comprises a segment encoding a polypeptide
of interest operably linked to additional segments that
provide for its transcription in a host cell. Such
additional segments may include promoter and terminator
sequences, and may optionally include one or more origins
of replication, one or more selectable markers, an
enhancer, a polyadenylation signal, and the like.
Expression vectors are generally derived from plasmid or
viral DNA, or may contain elements of both.

The term "isolated", when applied to a
polynucleotide molecule, denotes that the polynucleotide
has been removed from its natural genetic milieu and is
thus free of other extraneous or unwanted coding
sequences, and is in a form suitable for use within
genetically engineered protein production systems. Such
isolated molecules are those that are separated from their
natural environment and include cDNA and genomic clones,
as well as synthetic polynucleotides. Isolated DNA
molecules of the present invention may include naturally
occurring 5' and 3' untranslated regions such as promoters
and terminators. The identification of associated regions
will be evident to one of ordinary skill in the art (see,
for example, Dynan and Tjian, Nature 316:774-78, 1985).
When applied to a protein, the term "isolated" indicates
that the protein is found in a condition other than its
native environment, such as apart from blood and animal
tissue. In a preferred form, the isolated protein is
substantially free of other proteins, particularly other
proteins of animal origin. It is preferred to provide the
protein in a highly purified form, i.e., at least 90%
pure, preferably greater than 95% pure, more preferably
greater than 99% pure.

The term "operably linked", when referring to
DNA segments, denotes that the segments are arranged so
that they function in concert for their intended purposes,
e.g., transcription initiates in the promoter and proceeds
through the coding segment to the terminator.

The term "ortholog" denotes a polypeptide or
protein obtained from one species that is the functional
counterpart of a polypeptide or protein from a different
species. Sequence differences among orthologs are the
result of speciation.

The term "polynucleotide" denotes a single- or
double-stranded polymer of deoxyribonucleotide or
ribonucleotide bases read from the 5' to the 3' end.
Polynucleotides include RNA and DNA, and may be isolated
from natural sources, synthesized in vitro, or prepared
from a combination of natural and synthetic molecules.
The length of a polynucleotide molecule is given herein in
terms of nucleotides (abbreviated "nt") or base pairs
(abbreviated "bp"). The term "nucleotides" is used for
both single- and double-stranded molecules where the
context permits. When the term is applied to
double-stranded molecules it is used to denote overall length and
will be understood to be equivalent to the term "base
pairs". It will be recognized by those skilled in the art
that the two strands of a double-stranded polynucleotide
may differ slightly in length and that the ends thereof
may be staggered as a result of enzymatic cleavage; thus
all nucleotides within a double-stranded polynucleotide
molecule may not be paired. Such unpaired ends will in
general not exceed 20 nt in length.

The term "promoter" denotes a portion of a gene
containing DNA sequences that provide for the binding of
RNA polymerase and initiation of transcription. Promoter
sequences are commonly, but not always, found in the 5'
non-coding regions of genes.

A "protease" is an enzyme that cleaves peptide
bonds in proteins. A "protease precursor" is a relatively
inactive form of the enzyme that commonly becomes
activated upon cleavage by another protease.

The term "secretory signal sequence" denotes a
DNA sequence that encodes a polypeptide (a "secretory
peptide") that, as a component of a larger polypeptide,
directs the larger polypeptide through a secretory pathway
of a cell in which it is synthesized. The larger
polypeptide is commonly cleaved to remove the secretory
peptide during transit through the secretory pathway.

All references cited herein are incorporated by
reference in their entirety.

The present invention provides novel serine
proteases, serine protease precursors, and useful
polypeptide fragments thereof. The sequence of a
representative protein of the present invention is shown
in SEQ ID NO:2. This protein shows significant amino acid
sequence homology to several serine proteases, including
Bacillus licheniformis glutamyl endopeptidase (Svendsen
and Breddam, Eur. J. Biochem. 204:165-171, 1992), human
clotting factor X (Leytus et al., Biochem. 25:5098-5102,
1986), human elastase (Kawashima et al., DNA 6:163-172,
1987), rat mast cell protease (Benfey et al., J. Biol.
Chem. 262:5377-5384, 1987), Streptomyces griseus trypsin
(Kim et al., Biochem. Biophys. Res. Comm. 181:707-713,
1991), Hypoderma lineatum collagenase (J. Biol. Chem.
262:7546-7551, 1987), and bovine trypsinogen (Titani et
al., Biochem. 14:1358-1366, 1975). The protein has been
designated "Zsig13".

A Zsig13 polynucleotide sequence was initially
identified by querying a database of expressed sequence
tags (ESTs) for secretory signal sequences characterized
by an upstream methionine start site, a hydrophobic region
of approximately 13 amino acid residues, and a cleavage
site as defined by von Heijne (Nuc. Acids Res. 14:4683,
1986). Analysis of a full-length DNA (shown in SEQ ID
NO:1) revealed its homology with other members of the
serine protease family. Northern blot analysis indicated
the presence of two corresponding messages, a predominant
transcript of approximately 1.8 kb and a secondary
transcript of approximately 4 kb. The sequence of SEQ ID
NO:1 consists of 1634 bp, not including a poly(A) tail.
The sequence includes an open reading frame of 1176 base
pairs.

An alignment of Zsig13 with related proteins was
used to identify the catalytic triad of His (156), Asp
(227) and Ser (322) as shown in SEQ ID NO:2. The
Leu-Thr-Ala-Ala-His-Cys sequence (residues 152-157 of SEQ ID NO:2)
is a characteristic active site His signature within
serine proteases. Residues -19 through -1 of SEQ ID NO:2
make up a putative signal peptide. Residues 106-109 of
SEQ ID NO:2 (Arg-Arg-Lys-Arg) are a characteristic
cleavage site; such cleavage may serve a regulatory
function, such as activation of the protein during or
after secretion. Activation by proteolytic cleavage is
common among serine proteases. While not wishing to be
bound by theory, the protein is believed to become active
following exposure of a free amino group on Gln 110 or,
with additional processing, Ile 111. However, in contrast
to many other serine proteases, the non-catalytic,
amino-terminal fragment does not appear to remain tethered to
the remainder of the molecule after this cleavage has
occurred. Alignment of sequences further indicates that
active site contact residues are at positions 244 (Ile),
291 (Asp), 292 (Ala), 316 (Lys), 317 (Ile), 328 (Asp), 350
(Ile), 356 (Gly), 358 (Tyr) and 360 (Asp) of SEQ ID NO:2.
Sequence alignment identified the Lys residue at position
316 as the key residue in the base of the P1 ligand
specificity pocket, generating specificity for Glu and/or
Asp in the P1 position of the substrate protein.
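Locating short signatures of this kind in a protein sequence can be sketched in Python as follows. The toy sequence is invented for illustration and is not SEQ ID NO:2, so the positions returned refer only to the toy string; the function names are likewise hypothetical.

```python
# Three-letter to one-letter codes for the residues used in the two motifs.
THREE_TO_ONE = {"Leu": "L", "Thr": "T", "Ala": "A", "His": "H",
                "Cys": "C", "Arg": "R", "Lys": "K"}


def to_one_letter(motif: str) -> str:
    """Convert a hyphenated three-letter motif, e.g. 'Arg-Arg-Lys-Arg'."""
    return "".join(THREE_TO_ONE[res] for res in motif.split("-"))


def find_motif(seq: str, motif: str) -> list[int]:
    """Return 1-based start positions of every exact match of `motif`."""
    pattern = to_one_letter(motif)
    width = len(pattern)
    return [i + 1 for i in range(len(seq) - width + 1)
            if seq[i:i + width] == pattern]


# Invented toy sequence (NOT SEQ ID NO:2) containing both signatures.
toy = "GSRRKRQIVGGLTAAHCAS"
print(find_motif(toy, "Arg-Arg-Lys-Arg"))          # basic cleavage site
print(find_motif(toy, "Leu-Thr-Ala-Ala-His-Cys"))  # active site His signature
```

Exact matching is used here for simplicity; assigning catalytic residues in a real sequence relies on alignment with related proteases, as described above.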

With reference to SEQ ID NO:2, additional
structural features of Zsig13 include paired cysteine
residues at positions 46 and 50, 141 and 157, 276 and 290,
and 351 and 361. Potential N-linked glycosylation sites
are at residues Asn-74 and Asn-188. The calculated
molecular weight of the peptide backbone of the
392-residue precursor is 43,829.55, with a predicted pI of
10.44. The calculated peptide backbone molecular weight
of residues 110-373 is 30,074, with a predicted pI of
10.4.
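The residue counts stated for the precursor are mutually consistent, as the following Python sketch checks. The variable names are illustrative, and the observation that the 1176 bp figure corresponds to exactly 392 codons (and so would not include a stop codon) is an inference from the arithmetic alone, not a statement from the sequence listing.

```python
ORF_BP = 1176          # open reading frame length stated for SEQ ID NO:1
SIGNAL_RESIDUES = 19   # putative signal peptide, residues -19 through -1
LAST_RESIDUE = 373     # carboxyl-terminal residue of SEQ ID NO:2

# Three nucleotides per codon.
codons, remainder = divmod(ORF_BP, 3)

# The numbering has no residue 0, so the precursor length is the
# signal peptide length plus the number of the last residue.
precursor_len = SIGNAL_RESIDUES + LAST_RESIDUE

print(codons, remainder, precursor_len)
```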

The Zsig13 protein was found to be highly
expressed in tissues that are exposed to the external
environment, including trachea, bladder, small intestine,
colon, and prostate. This tissue distribution suggests a
digestive or anti-bacterial function. Several
anti-bacterial serine proteases are known to be produced in
neutrophils, where they are stored in granules as inactive
proforms (Gabay, ibid.; Scocchi et al., ibid.).
Expression was also detected in aorta and fetal kidney.

The present invention also provides isolated
Zsig13 polypeptides that are substantially homologous to
the polypeptides of SEQ ID NO:2 and their orthologs. The
term "substantially homologous" is used herein to denote
polypeptides having 50%, preferably 60%, more preferably
at least 80%, sequence identity to the polypeptide sequences
of SEQ ID NO:2 or their orthologs. Such polypeptides will
more preferably be at least 90% identical, and most
preferably 95% or more identical to polypeptides of SEQ ID
NO:2 or their orthologs. Percent sequence identity is
determined by conventional methods. See, for example,
Altschul et al., Bull. Math. Bio. 48:603-616, 1986 and
Henikoff and Henikoff, Proc. Natl. Acad. Sci. USA
89:10915-10919, 1992. Briefly, two amino acid sequences
are aligned using the "blosum 62" scoring matrix of
Henikoff and Henikoff (ibid.) as shown in Table 1 (amino
acids are indicated by the standard one-letter codes). The
percent identity is then calculated as:

      Total number of identical matches
    ---------------------------------------------- x 100
    [length of the longer sequence plus the
     number of gaps introduced into the longer
     sequence in order to align the two sequences]

Sequence identity of polynucleotide molecules
is determined by similar methods using a ratio as
disclosed above.
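The ratio above can be expressed as a short Python sketch that operates on two sequences that have already been aligned. The alignment step itself (scoring matrix and gap penalties) is outside the scope of the sketch; the function name, the use of "-" as the gap character, and the toy sequences are assumptions made for illustration.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity of two pre-aligned sequences.

    Numerator: positions where both rows carry the same residue.
    Denominator: length of the longer (ungapped) sequence plus the
    number of gaps introduced into that sequence by the alignment.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    matches = sum(1 for x, y in zip(aligned_a, aligned_b)
                  if x == y and x != gap)

    # Identify the longer sequence by its ungapped length.
    longer = max((aligned_a, aligned_b),
                 key=lambda row: len(row) - row.count(gap))
    ungapped_len = len(longer) - longer.count(gap)
    gaps_in_longer = longer.count(gap)

    return 100.0 * matches / (ungapped_len + gaps_in_longer)


# Toy alignment: five identical positions, longer sequence of 7 residues
# with no gaps, so the result is 5/7 expressed as a percentage.
print(percent_identity("MKT-LLV", "MKTALIV"))
```

Because the denominator counts gaps in the longer sequence, inserting gaps to force additional matches lowers the reported identity rather than inflating it.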

Substantially homologous proteins and
polypeptides are characterized as having one or more
amino acid substitutions, deletions or additions. These
changes are preferably of a minor nature, that is
conservative amino acid substitutions (see Table 2) and
other substitutions that do not significantly affect the
folding or activity of the protein or polypeptide; small
deletions, typically of one to about 30 amino acids; and
small amino- or carboxyl-terminal extensions, such as an
amino-terminal methionine residue, a small linker peptide
of up to about 20-25 residues, or a small extension that
facilitates purification (an affinity tag), such as a
poly-histidine tract, protein A (Nilsson et al., EMBO J.
4:1075, 1985; Nilsson et al., Methods Enzymol. 198:3,
1991), glutathione S transferase (Smith and Johnson, Gene
67:31, 1988), maltose binding protein (Kellerman and
Ferenci, Methods Enzymol. 90:459-463, 1982; Guan et al.,
Gene 67:21-30, 1987), thioredoxin, ubiquitin, cellulose
binding protein, T7 polymerase, or other antigenic
epitope or binding domain. See, in general, Ford et al.,
Protein Expression and Purification 2:95-107, 1991.
DNAs encoding affinity tags are available from commercial
suppliers (e.g., Pharmacia Biotech, Piscataway, NJ; New
England Biolabs, Beverly, MA). Zsig13 proteins
comprising linkers, affinity tags, or other extensions
will typically be from 283 to 398 residues in length,
given a polypeptide having an amino terminus within
residues 1-111 of SEQ ID NO:2 and a carboxyl terminus at
residue 373 of SEQ ID NO:2, and further comprising an
extension of 20-25 residues. Those skilled in the art
will recognize that polypeptides comprising longer
extensions are also within the scope of the present
invention.
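The stated length ranges follow from simple residue counting, which the Python sketch below reproduces. The helper name is illustrative; the residue numbers and extension lengths are those given in the text.

```python
def span(first: int, last: int) -> int:
    """Number of residues from `first` through `last`, inclusive."""
    return last - first + 1


core = span(111, 373)            # shortest core: Ile 111 through Asn 373
shortest = span(111, 373) + 20   # shortest core plus a 20-residue extension
longest = span(1, 373) + 25      # residues 1-373 plus a 25-residue extension

print(core, shortest, longest)   # 263 283 398
```

The first value corresponds to the lower bound of the 263 to 398 residue range given in the summary for proteins without an extension.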
Table 2

Conservative amino acid substitutions

Basic:        arginine
              lysine
              histidine
Acidic:       glutamic acid
              aspartic acid
Polar:        glutamine
              asparagine
Hydrophobic:  leucine
              isoleucine
              valine
Aromatic:     phenylalanine
              tryptophan
              tyrosine
Small:        glycine
              alanine
              serine
              threonine
              methionine

The proteins of the present invention can also
comprise non-naturally occurring amino acid residues.
Non-naturally occurring amino acids include, without
limitation, trans-3-methylproline, 2,4-methanoproline,
cis-4-hydroxyproline, trans-4-hydroxyproline,
N-methylglycine, allo-threonine, methylthreonine,
hydroxyethylcysteine, hydroxyethylhomocysteine,
nitroglutamine, homoglutamine, pipecolic acid,
tert-leucine, norvaline, 2-azaphenylalanine,
3-azaphenylalanine, 4-azaphenylalanine, and
4-fluorophenylalanine. Several methods are known in the
art for incorporating non-naturally occurring amino acid
residues into proteins. For example, an in vitro system
can be employed wherein nonsense mutations are suppressed
using chemically aminoacylated suppressor tRNAs. Methods
for synthesizing amino acids and aminoacylating tRNA are
known in the art. Transcription and translation of
plasmids containing nonsense mutations are carried out in
a cell free system comprising an E. coli S30 extract and
commercially available enzymes and other reagents.
Proteins are purified by chromatography. See, for
example, Robertson et al., J. Am. Chem. Soc. 113:2722,
1991; Ellman et al., Methods Enzymol. 202:301, 1991;
Chung et al., Science 259:806-809, 1993; and Chung et
al., Proc. Natl. Acad. Sci. USA 90:10145-10149, 1993.
In a second method, translation is carried out in Xenopus
oocytes by microinjection of mutated mRNA and chemically
aminoacylated suppressor tRNAs (Turcatti et al., J. Biol.
Chem. 271:19991-19998, 1996). Within a third method, E.
coli cells are cultured in the absence of a natural amino
acid that is to be replaced (e.g., phenylalanine) and in
the presence of the desired non-naturally occurring amino
acid(s) (e.g., 2-azaphenylalanine, 3-azaphenylalanine,
4-azaphenylalanine, or 4-fluorophenylalanine). The
non-naturally occurring amino acid is incorporated into the
protein in place of its natural counterpart. See Koide
et al., Biochem. 33:7470-7476, 1994. Naturally occurring
amino acid residues can be converted to non-naturally
occurring species by in vitro chemical modification.
Chemical modification can be combined with site-directed
mutagenesis to further expand the range of substitutions
(Wynn and Richards, Protein Sci. 2:395-403, 1993).

Essential amino acids in the Zsig13
polypeptides of the present invention can be identified
according to procedures known in the art, such as
site-directed mutagenesis or alanine-scanning mutagenesis
(Cunningham and Wells, Science 244:1081-1085, 1989). In
the latter technique, single alanine mutations are
introduced at every residue in the molecule, and the
resultant mutant molecules are tested for biological
activity as disclosed above to identify amino acid
residues that are critical to the activity of the
molecule. See also, Hilton et al., J. Biol. Chem.
271:4699-4708, 1996. Residues important for substrate
binding and cleavage can also be determined by physical
analysis of structure, as determined by such techniques
as nuclear magnetic resonance, crystallography, electron
diffraction or photoaffinity labeling, in conjunction
with mutation of putative contact site amino acids. See,
for example, de Vos et al., Science 255:306-312, 1992;
Smith et al., J. Mol. Biol. 224:899-904, 1992; Wlodaver
et al., FEBS Lett. 309:59-64, 1992. The identities of
essential amino acids can also be inferred from analysis
of homologies with related serine proteases.
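The design of an alanine scan can be sketched in Python as an enumeration of single-substitution variants. The function name, the variant naming scheme (original residue, position, "A"), the choice to skip positions that are already alanine, and the toy sequence are all assumptions made for illustration; activity testing of each variant is a laboratory step and is not modeled.

```python
def alanine_scan(seq: str) -> dict[str, str]:
    """Map variant names (e.g. 'D2A') to sequences carrying a single
    alanine substitution at the named 1-based position."""
    variants = {}
    for pos, residue in enumerate(seq, start=1):
        if residue == "A":
            continue  # already alanine; substitution would change nothing
        variants[f"{residue}{pos}A"] = seq[:pos - 1] + "A" + seq[pos:]
    return variants


# Toy four-residue sequence (one-letter codes), not taken from SEQ ID NO:2.
for name, variant in alanine_scan("HDSA").items():
    print(name, variant)
```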

Multiple amino acid substitutions can be made
and tested using known methods of mutagenesis and
screening, such as those disclosed by Reidhaar-Olson and
Sauer (Science 241:53-57, 1988) or Bowie and Sauer (Proc.
Natl. Acad. Sci. USA 86:2152-2156, 1989). Briefly, these
authors disclose methods for simultaneously randomizing
two or more positions in a polypeptide, selecting for
functional polypeptide, and then sequencing the
mutagenized polypeptides to determine the spectrum of
allowable substitutions at each position. Other methods
that can be used include phage display (e.g., Lowman et
al., Biochem. 30:10832-10837, 1991; Ladner et al., U.S.
Patent No. 5,223,409; Huse, WIPO Publication WO 92/06204)
and region-directed mutagenesis (Derbyshire et al., Gene
46:145, 1986; Ner et al., DNA 7:127, 1988).

Mutagenesis methods as disclosed above can be
combined with high-throughput, automated screening
methods to detect activity of cloned, mutagenized
polypeptides in host cells. Mutagenized DNA molecules
that encode proteolytically active proteins or precursors
thereof can be recovered from the host cells and rapidly
sequenced using modern equipment. These methods allow
the rapid determination of the importance of individual
amino acid residues in a polypeptide of interest, and can
be applied to polypeptides of unknown structure.

Using the methods disclosed above, one of
ordinary skill in the art can identify and/or prepare a
variety of polypeptides that are substantially homologous
to residues 111 through 373 of SEQ ID NO:2 or allelic
variants thereof and retain the proteolytic properties of
the wild-type protein. Such polypeptides may include a
targeting moiety comprising additional amino acid
residues that form an independently folding binding
domain. Such domains include, for example, an
extracellular ligand-binding domain (e.g., one or more
fibronectin type III domains) of a cytokine receptor;
immunoglobulin domains; DNA binding domains (see, e.g.,
He et al., Nature 378:92-96, 1995); affinity tags; and
the like. Such polypeptides may also include additional
polypeptide segments as generally disclosed above.

In addition to the fusion proteins disclosed
above, the present invention provides fusions comprising
the secretory peptide of Zsig13 (residues -19 through -1
of SEQ ID NO:2). This secretory peptide can be used to
direct the secretion of other proteins of interest by
joining a polynucleotide sequence encoding it to the 5'
end of a sequence encoding a protein of interest.

Within the present invention, proteins,
including variants and fragments of SEQ ID NO:2, can be
tested for serine protease activity using conventional
assays. Briefly, substrate cleavage is conveniently
assayed using a tetrapeptide that mimics the cleavage
site of the natural substrate and which is linked, via a
peptide bond, to a carboxyl-terminal para-nitro-anilide
(pNA) group. The protease hydrolyzes the bond between
the fourth amino acid residue and the pNA group, causing
the pNA group to undergo a dramatic increase in
absorbance at 405 nm. Such substrates will preferably
contain a Glu or Asp residue at the P1 position.
Suitable substrates can be synthesized according to known
methods or obtained from commercial suppliers. When the
serine protease is prepared as an inactive precursor
(e.g., comprising N-terminal residues 1-109 of SEQ ID
NO:2), it is activated by cleavage with a suitable
protease (e.g., furin (Steiner et al., J. Biol. Chem.
267:23435-23438, 1992)) prior to assay. Assays of this
type are well known in the art. See, for example,
Lottenberg et al., Thrombosis Research 28:313-332, 1982;
Cho et al., Biochem. 23:644-650, 1984; Foster et al.,
Biochem. 26:7003-7011, 1987.
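One way such absorbance data might be reduced to a reaction rate is sketched below in Python, using the Beer-Lambert relation A = epsilon x c x l. This is an illustrative calculation rather than a protocol from the cited references. The function name and all numerical values are hypothetical; in particular, the molar absorption coefficient is a placeholder that would have to be taken from the substrate supplier or from a calibration under the actual assay conditions.

```python
def pna_release_rate(a405_start: float, a405_end: float, minutes: float,
                     epsilon_per_m_cm: float, path_cm: float = 1.0) -> float:
    """Rate of pNA release in mol/L per minute.

    Assumes the absorbance change is linear over the interval and is
    due solely to released pNA (Beer-Lambert: A = epsilon * c * l).
    """
    if minutes <= 0 or epsilon_per_m_cm <= 0 or path_cm <= 0:
        raise ValueError("time, epsilon, and path length must be positive")
    delta_a_per_min = (a405_end - a405_start) / minutes
    return delta_a_per_min / (epsilon_per_m_cm * path_cm)


# Placeholder readings and a placeholder coefficient (assumptions only).
rate = pna_release_rate(0.05, 0.35, 5.0, epsilon_per_m_cm=10_000.0)
print(rate)
```

Keeping the coefficient as a required argument, with no default, avoids silently applying a value that does not match the buffer or wavelength in use.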

The isolated polynucleotides of the present
invention include DNA and RNA. Methods for isolating DNA
and RNA are well known in the art. For example, RNA can
be isolated from trachea, bladder, small intestine,
colon, or prostate, which RNA is then used as a template
for preparation of complementary DNA (cDNA). DNA can
also be prepared using RNA from other tissues or isolated
as genomic DNA. Total RNA can be prepared using
guanidine HCl extraction followed by isolation by
centrifugation in a CsCl gradient (Chirgwin et al.,
Biochemistry 18:52-94, 1979). Poly(A)+ RNA is prepared
from total RNA using the method of Aviv and Leder (Proc.
Natl. Acad. Sci. USA 69:1408-1412, 1972). Complementary
DNA (cDNA) is prepared from poly(A)+ RNA using known
methods. Polynucleotides encoding Zsig13 polypeptides
are then identified and isolated by, for example,
hybridization or polymerase chain reaction (PCR).

Within SEQ ID NO:1 and SEQ ID NO:2, residues
80, 95, 96, and 149 can be any amino acid residue
(denoted as Xaa). Within a preferred embodiment of the
invention, residue 80 is Thr, residue 95 is Gln, residue
96 is His, and residue 149 is Lys.

A second Zsigl3 DNA sequence is shown in SEQ ID
NO:14 (with the corresponding amino acid sequence shown
in SEQ ID NO:15). Within SEQ ID NO:15, residue 60 is
Glu, residue 80 is Thr, residue 95 is Gln, residue 96 is
His, residue 149 is Lys, residue 299 is Ser, and residue
369 is Pro. All other residues in SEQ ID NO:15 are the
same as their respective counterparts in SEQ ID NO:2.
The calculated molecular weight of the peptide backbone
of the 392-residue polypeptide shown in SEQ ID NO:15 is
43,918.56, with a predicted pI of 10.38. The calculated
peptide backbone molecular weight of residues 110-373 is
28,113.80, with a predicted pI of 10.49.
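
The text does not state how the backbone molecular weights
were calculated. One conventional approach, sketched below
as an assumption, sums average residue masses and adds one
water for the free termini. The peptide in the example is
hypothetical, because the sequence listing is not reproduced
in this section, and the sketch does not attempt the pI
prediction.

```python
# Sketch: peptide-backbone molecular weight from average residue masses.
# ASSUMPTION: this is a conventional calculation, not necessarily the one
# used to obtain the values quoted in the text.
AVERAGE_RESIDUE_MASS = {  # daltons, residue = amino acid minus water
    "G": 57.0519, "A": 71.0788, "S": 87.0782, "P": 97.1167,
    "V": 99.1326, "T": 101.1051, "C": 103.1388, "L": 113.1594,
    "I": 113.1594, "N": 114.1038, "D": 115.0886, "Q": 128.1307,
    "K": 128.1741, "E": 129.1155, "M": 131.1926, "H": 137.1411,
    "F": 147.1766, "R": 156.1875, "Y": 163.1760, "W": 186.2132,
}
WATER_MASS = 18.01528  # one water per chain for the free N- and C-termini


def backbone_molecular_weight(sequence):
    """Sum residue masses for a one-letter sequence and add one water."""
    residues = sequence.upper()
    unknown = sorted(set(residues) - set(AVERAGE_RESIDUE_MASS))
    if unknown:
        raise ValueError(f"unsupported residue codes: {unknown}")
    return sum(AVERAGE_RESIDUE_MASS[aa] for aa in residues) + WATER_MASS


if __name__ == "__main__":
    hypothetical_peptide = "TAAHC"  # illustration only, not from SEQ ID NO:15
    print(f"{backbone_molecular_weight(hypothetical_peptide):.2f}")
```

Positions denoted Xaa cannot be summed this way until a
specific residue is assigned, which is why the calculation
is meaningful for SEQ ID NO:15 rather than SEQ ID NO:2.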

Those skilled in the art will recognize that
the sequences disclosed in SEQ ID NO:14 and SEQ ID NO:15
represent a single allele of the human Zsigl3 gene and
polypeptide, and that allelic variation and alternative
splicing are expected to occur. Allelic variants can be
cloned by probing cDNA or genomic libraries from
different individuals according to standard procedures.
Allelic variants of the DNA sequence shown in SEQ ID
NO:14, including those containing silent mutations and
those in which mutations result in amino acid sequence
changes, are within the scope of the present invention,
as are proteins which are allelic variants of SEQ ID
NO:15.

The invention also encompasses degenerate
polynucleotide sequences encoding proteins as disclosed
above. Those skilled in the art will readily recognize
that, in view of the degeneracy of the genetic code,
considerable sequence variation is possible among these
polynucleotide molecules. SEQ ID NO:16 is a degenerate
DNA sequence that encompasses all DNAs that encode the
Zsigl3 polypeptide of SEQ ID NO:15. Those skilled in the
art will recognize that the degenerate sequence of SEQ ID
NO:16 also provides all RNA sequences encoding SEQ ID
NO:15 by substituting U for T. Thus, Zsigl3
polypeptide-encoding polynucleotides comprising segments
of SEQ ID NO:16 and their RNA equivalents are
contemplated by the present invention. Table 3 sets forth
the one-letter codes used within SEQ ID NO:16 to denote
degenerate nucleotide positions. "Resolutions" are the
nucleotides denoted by a code letter. "Complement"
indicates the code for the complementary nucleotide(s).
For example, the code Y denotes either C or T, and its
complement R denotes A or G, A being complementary to T,
and G being
complementary to C.

The degenerate codons used in SEQ ID NO:16,
encompassing all possible codons for a given amino acid,
are set forth in Table 4, below.

One of ordinary skill in the art will
appreciate that some ambiguity is introduced in
determining a degenerate codon, representative of all
possible codons encoding each amino acid. For example,
the degenerate codon for serine (WSN) can, in some
circumstances, encode arginine (AGR), and the degenerate
codon for arginine (MGN) can, in some circumstances,
encode serine (AGY). A similar relationship exists
between codons encoding phenylalanine and leucine. Thus,
some polynucleotides encompassed by the degenerate
sequence may encode variant amino acid sequences, but one
of ordinary skill in the art can easily identify such
variant sequences by reference to the amino acid sequence
of SEQ ID NO:15. Variant sequences can be readily tested
for functionality as described herein.
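
The code letters, their complements, and the serine/arginine
ambiguity described above can be checked with a short sketch.
The mapping below is the standard IUPAC nucleotide code,
which is assumed here to correspond to Table 3; the function
names are illustrative only.

```python
# Sketch: resolve degenerate nucleotide codes, complement them, and show
# the codon ambiguity discussed in the text.
# ASSUMPTION: the standard IUPAC code is taken to match Table 3.
from itertools import product

IUPAC_RESOLUTIONS = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "M": "AC", "K": "GT", "S": "CG", "W": "AT",
    "H": "ACT", "B": "CGT", "V": "ACG", "D": "AGT", "N": "ACGT",
}
BASE_PAIR = {"A": "T", "C": "G", "G": "C", "T": "A"}
CODE_FOR_SET = {frozenset(bases): code
                for code, bases in IUPAC_RESOLUTIONS.items()}

SERINE_CODONS = {"TCT", "TCC", "TCA", "TCG", "AGT", "AGC"}
ARGININE_CODONS = {"CGT", "CGC", "CGA", "CGG", "AGA", "AGG"}


def complement_code(code):
    """Return the code letter denoting the complements of a code's bases."""
    paired = frozenset(BASE_PAIR[base] for base in IUPAC_RESOLUTIONS[code])
    return CODE_FOR_SET[paired]


def reverse_complement(degenerate_seq):
    """Reverse-complement a degenerate sequence (spaces are ignored)."""
    letters = degenerate_seq.replace(" ", "").upper()
    return "".join(complement_code(c) for c in reversed(letters))


def expand(degenerate_seq):
    """Return the set of all plain sequences a degenerate sequence denotes."""
    letters = degenerate_seq.replace(" ", "").upper()
    choices = (IUPAC_RESOLUTIONS[c] for c in letters)
    return {"".join(combo) for combo in product(*choices)}


if __name__ == "__main__":
    print(complement_code("Y"))                       # complement of C/T
    print(sorted(expand("WSN") & ARGININE_CODONS))    # Arg codons within WSN
    print(sorted(expand("MGN") & SERINE_CODONS))      # Ser codons within MGN
```

Expanding WSN yields sixteen codons, more than the six that
encode serine, which is why a degenerate sequence has to be
checked against the amino acid sequence as the text notes.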

For any Zsigl3 polypeptide, including variants
and fusion proteins, one of ordinary skill in the art can
readily generate a fully degenerate polynucleotide
sequence encoding that variant using the information set
forth in Tables 3 and 4, above.
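
A minimal sketch of such a back-translation follows. Table 4
is not reproduced in this section, so the codon assignments
below are an assumption: they use the degenerate codons
named in the text (WSN for serine, MGN for arginine) plus
conventional assignments for the other residues.

```python
# Sketch: back-translate a peptide into a fully degenerate DNA sequence.
# ASSUMPTION: the codon assignments stand in for Table 4, which is not
# shown here; WSN (Ser) and MGN (Arg) are the ones named in the text.
DEGENERATE_CODON = {
    "C": "TGY", "S": "WSN", "T": "ACN", "P": "CCN", "A": "GCN",
    "G": "GGN", "N": "AAY", "D": "GAY", "E": "GAR", "Q": "CAR",
    "H": "CAY", "R": "MGN", "K": "AAR", "M": "ATG", "I": "ATH",
    "L": "YTN", "V": "GTN", "F": "TTY", "Y": "TAY", "W": "TGG",
}
RESOLUTION_COUNT = {  # number of bases each code letter can stand for
    "A": 1, "C": 1, "G": 1, "T": 1,
    "R": 2, "Y": 2, "M": 2, "K": 2, "S": 2, "W": 2,
    "H": 3, "B": 3, "V": 3, "D": 3, "N": 4,
}


def back_translate(peptide):
    """Return the degenerate DNA sequence for a one-letter peptide."""
    return "".join(DEGENERATE_CODON[aa] for aa in peptide.upper())


def degeneracy(degenerate_seq):
    """Count the plain DNA sequences a degenerate sequence encompasses."""
    total = 1
    for letter in degenerate_seq.upper():
        total *= RESOLUTION_COUNT[letter]
    return total


if __name__ == "__main__":
    dna = back_translate("CTGS")  # short illustrative peptide
    print(dna, degeneracy(dna))
```

For the peptide Cys-Thr-Gly-Ser the result is TGYACNGGNWSN,
which matches the first four codons of the first sense probe
listed in Table 5.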

Allelic variants and orthologs of the human
Zsigl3 protein shown in SEQ ID NO:15 can be obtained by
conventional cloning methods. The DNA sequence shown in
SEQ ID NO:1 or SEQ ID NO:14 or portions thereof can be
used as probes or primers to prepare other
polynucleotides from cells or libraries (including cDNA
and genomic libraries) from humans or other animals of
interest, particularly mammals including rodents,
rabbits, ungulates, primates, and others of economic
importance or biomedical interest. It is preferred to
derive probes and primers from regions of the molecule
that are relatively conserved within the family of serine
proteases, such as residues 141-146, 153-158, 209-214,
and 224-229 of SEQ ID NO:2. Methods for isolating
additional polynucleotides are known in the art. For
example, a cDNA can be cloned using mRNA obtained from a
tissue or cell type that expresses the protein. Suitable
sources of mRNA can be identified by probing Northern
blots with probes designed from the sequences disclosed
herein. Preferred sources of mRNA include trachea, small
intestine, colon, prostate, and bladder. A library is
then prepared from mRNA of a positive tissue or cell
line. A cDNA of interest can then be isolated by a
variety of methods, such as by probing with a complete or
partial human cDNA or with one or more sets of degenerate
probes based on the disclosed sequences. A cDNA can also
be cloned using the polymerase chain reaction, or PCR
(Mullis, U.S. Patent 4,683,202), using primers designed
from the sequences disclosed herein. Of particular
interest for cloning are degenerate probes and primers
designed from the regions of SEQ ID NO:2 disclosed above
and alignment with other serine proteases. Families of
preferred degenerate probes are shown in Table 5.



                          Table 5

Nucleotides
(SEQ ID NO:1)  Sense                     Complement

582-598        TGY ACN GGN WSN HTN RT    AY NAD NSW NCC NGT RCA
               (SEQ ID NO:3)             (SEQ ID NO:4)

618-634        ACN GCN GSN CAY TGY AT    AT RCA RTG NSC NGC NGT
               (SEQ ID NO:5)             (SEQ ID NO:6)

787-803        WY RTN CCN WVN GGN TGG    CCA NCC NBW NGG NAY RW
               (SEQ ID NO:7)             (SEQ ID NO:8)

831-847        AYN RAY TAY GAY TAY GS    SC RTA RTC RTA RTY NRT
               (SEQ ID NO:9)             (SEQ ID NO:10)

Within an additional method, the cDNA library
can be used to transform or transfect host cells, and
expression of the cDNA of interest can be detected with
an antibody that specifically binds to an epitope of a
Zsigl3 polypeptide. Similar techniques can also be
applied to the isolation of genomic clones.

Within preferred embodiments of the invention
the isolated polynucleotides will hybridize to similar
sized regions of SEQ ID NO:1 or SEQ ID NO:14, or a
sequence complementary to SEQ ID NO:1 or SEQ ID NO:14,
under stringent conditions. In general, stringent
conditions are selected to be about 5°C lower than the
thermal melting point (Tm) for the specific sequence at a
defined ionic strength and pH. The Tm is the temperature
(under defined ionic strength and pH) at which 50% of the
target sequence hybridizes to a perfectly matched probe.
Typical stringent conditions are those in which the salt
concentration does not exceed about 0.03 M at pH 7 and
the temperature is at least about 60°C, with washes
carried out in the presence of EDTA.
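
The relationship between Tm and the hybridization
temperature can be sketched numerically. The text does not
give a formula for Tm; the function below uses one widely
quoted empirical approximation for long DNA duplexes and
should be read as an assumption. Published variants differ
in the length-correction term, which is therefore exposed as
a parameter, and the example sequence is made up.

```python
# Sketch: estimate Tm and a temperature about 5 degrees C below it.
# ASSUMPTION: the empirical salt/GC/length formula is not from the text;
# it applies to long DNA duplexes and ignores formamide and mismatches.
import math


def estimate_tm_celsius(sequence, sodium_molar, length_term=600.0):
    """Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - length_term/N."""
    seq = sequence.upper()
    if not seq or sodium_molar <= 0:
        raise ValueError("need a non-empty sequence and positive salt")
    gc_percent = 100.0 * (seq.count("G") + seq.count("C")) / len(seq)
    return (81.5 + 16.6 * math.log10(sodium_molar)
            + 0.41 * gc_percent - length_term / len(seq))


def stringent_temperature(tm_celsius, offset=5.0):
    """Temperature about `offset` degrees below the estimated Tm."""
    return tm_celsius - offset


if __name__ == "__main__":
    probe = "ATGC" * 25            # hypothetical 100-nt probe, 50% GC
    tm = estimate_tm_celsius(probe, sodium_molar=0.03)
    print(f"Tm ~ {tm:.1f} C, stringent ~ {stringent_temperature(tm):.1f} C")
```

Lower salt lowers the estimated Tm, which is consistent with
the text pairing a low salt concentration with a wash
temperature of at least about 60°C.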

The polypeptides of the present invention,
including full-length proteins, fragments thereof, and
fusion proteins, are produced in genetically engineered
host cells according to conventional techniques.
Suitable host cells are those cell types that can be
transformed or transfected with exogenous DNA and grown
in culture, and include bacteria, fungal cells, and
cultured higher eukaryotic cells. Techniques for
manipulating cloned DNA molecules and introducing
exogenous DNA into a variety of host cells are disclosed
by Sambrook et al., Molecular Cloning: A Laboratory
Manual, 2nd ed., Cold Spring Harbor Laboratory Press,
Cold Spring Harbor, NY, 1989.

In general, a DNA sequence encoding a protein
of the present invention is operably linked to a
transcription promoter and terminator within an
expression vector. The vector will commonly contain one
or more selectable markers and one or more origins of
replication, although those skilled in the art will
recognize that within certain systems selectable markers
can be provided on separate vectors, and replication of
the exogenous DNA can be provided by integration into the
host cell genome. Selection of promoters, terminators,
selectable markers, vectors and other elements is a
matter of routine design within the level of ordinary
skill in the art. Many such elements are described in
the literature and are available through commercial
suppliers.

To direct Zsigl3 polypeptides into the
secretory pathway of a host cell, a secretory signal
sequence (also known as a leader sequence, prepro
sequence or pre sequence) is provided in the expression
vector. The secretory signal sequence is joined to a DNA
sequence encoding a Zsigl3 polypeptide in the correct
reading frame. Secretory signal sequences are commonly
positioned 5' to the DNA sequence encoding the protein of
interest, although certain signal sequences may be
positioned 3' to the DNA sequence of interest (see, e.g.,
Welch et al., U.S. Patent No. 5,037,743; Holland et al.,
U.S. Patent No. 5,143,830). The secretory signal
sequence of Zsigl3 (e.g., the human secretory signal
sequence of SEQ ID NO:1 from nucleotide 105 to nucleotide
161) is generally preferred for use in mammalian cells.
Signals from host cell genes may be preferred in other
types of cells (e.g., yeast cells).

Yeast cells, particularly cells of the genus
Saccharomyces, are suitable for use within the present
invention. Methods for transforming yeast cells with
exogenous DNA and producing recombinant proteins
therefrom are disclosed by, for example, Kawasaki, U.S.
Patent No. 4,599,311; Kawasaki et al., U.S. Patent No.
4,931,373; Brake, U.S. Patent No. 4,870,008; Welch et
al., U.S. Patent No. 5,037,743; and Murray et al., U.S.
Patent No. 4,845,075. A preferred vector system for use
in yeast is the POT1 vector system disclosed by Kawasaki
et al. (U.S. Patent No. 4,931,373), which allows
transformed cells to be selected by growth in
glucose-containing media. Transformation systems for
other yeasts, including Hansenula polymorpha,
Schizosaccharomyces pombe, Kluyveromyces lactis,
Kluyveromyces fragilis, Ustilago maydis, Pichia pastoris,
Pichia methanolica and Candida maltosa are known in the
art. See, for example, Gleeson et al., J. Gen.
Microbiol. 132:3459-3465, 1986; Cregg, U.S. Patent No.
4,882,279; and Hiep et al., Yeast 9:1189-1197, 1993.

The use of Pichia methanolica as host for the
production of recombinant proteins is disclosed in WIPO
Publications WO 97/17450, WO 97/17451, WO 98/02536, and
WO 98/02565; and U.S. Patent No. 5,716,808. DNA
molecules for use in transforming P. methanolica will
commonly be prepared as double-stranded, circular
plasmids, which are preferably linearized prior to
transformation. For polypeptide production in P.
methanolica, it is preferred that the promoter and
terminator in the plasmid be that of a P. methanolica
gene, such as a P. methanolica alcohol utilization gene
(AUG1 or AUG2). Other useful promoters include those of
the dihydroxyacetone synthase (DHAS), formate
dehydrogenase (FMD), and catalase (CAT) genes. To
facilitate integration of the DNA into the host
chromosome, it is preferred to have the entire expression
segment of the plasmid flanked at both ends by host DNA
sequences. A preferred selectable marker for use in
Pichia methanolica is a P. methanolica ADE2 gene, which
encodes phosphoribosyl-5-aminoimidazole carboxylase
(AIRC; EC 4.1.1.21) and which allows ade2 host cells to
grow in the absence of adenine. For large-scale,
industrial processes where it is desirable to minimize
the use of methanol, it is preferred to use host cells in
which both methanol utilization genes (AUG1 and AUG2) are
deleted. For production of secreted proteins, host cells
deficient in vacuolar protease genes (PEP4 and PRB1) are
preferred. Electroporation is used to facilitate the
introduction of a plasmid containing DNA encoding a
polypeptide of interest into P. methanolica cells. It is
preferred to transform P. methanolica cells by
electroporation using an exponentially decaying, pulsed
electric field having a field strength of from 2.5 to 4.5
kV/cm, preferably about 3.75 kV/cm, and a time constant
(τ) of from 1 to 40 milliseconds, most preferably about
20 milliseconds.
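
The pulse parameters above translate into instrument
settings by simple arithmetic, sketched below. The cuvette
gap, capacitance, and resistance in the example are
hypothetical values chosen so that the numbers come out at
the preferred 3.75 kV/cm and 20 milliseconds; they are not
taken from this disclosure, and the time constant is modeled
as that of a simple resistor-capacitor discharge.

```python
# Sketch: relate field strength and time constant to pulser settings.
# ASSUMPTIONS: cuvette gap, capacitance and resistance are hypothetical;
# the pulse is modeled as an ideal RC exponential decay.
import math


def charging_voltage_kv(field_kv_per_cm, gap_cm):
    """Voltage across the cuvette needed for a given field strength."""
    return field_kv_per_cm * gap_cm


def time_constant_ms(resistance_ohm, capacitance_uf):
    """RC time constant in milliseconds (ohms x microfarads / 1000)."""
    return resistance_ohm * capacitance_uf * 1e-6 * 1e3


def field_at(t_ms, initial_field_kv_per_cm, tau_ms):
    """Field strength t_ms after the start of an exponential decay."""
    return initial_field_kv_per_cm * math.exp(-t_ms / tau_ms)


def within_preferred_ranges(field_kv_per_cm, tau_ms):
    """Check the ranges stated in the text: 2.5-4.5 kV/cm and 1-40 ms."""
    return 2.5 <= field_kv_per_cm <= 4.5 and 1.0 <= tau_ms <= 40.0


if __name__ == "__main__":
    volts = charging_voltage_kv(3.75, gap_cm=0.2)           # hypothetical gap
    tau = time_constant_ms(resistance_ohm=800.0, capacitance_uf=25.0)
    print(f"{volts:.2f} kV, tau = {tau:.1f} ms")
    print(within_preferred_ranges(3.75, tau))
```

After one time constant the field has fallen to 1/e of its
starting value, so a longer time constant exposes the cells
to a high field for longer.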

Other fungal cells are also suitable as host
cells. For example, Aspergillus cells can be utilized
according to the methods of McKnight et al., U.S. Patent
No. 4,935,349. Methods for transforming Acremonium
chrysogenum are disclosed by Sumino et al., U.S. Patent
No. 5,162,228.

Cultured mammalian cells can also be used as
hosts. Methods for introducing exogenous DNA into
mammalian host cells include calcium phosphate-mediated
transfection (Wigler et al., Cell 14:725, 1978; Corsaro
and Pearson, Somatic Cell Genetics 7:603, 1981; Graham
and Van der Eb, Virology 52:456, 1973), electroporation
(Neumann et al., EMBO J. 1:841-845, 1982) and
DEAE-dextran mediated transfection (Ausubel et al., eds.,
Current Protocols in Molecular Biology, John Wiley and
Sons, Inc., NY, 1987). The production of recombinant
proteins in cultured mammalian cells is disclosed by, for
example, Levinson et al., U.S. Patent No. 4,713,339;
Hagen et al., U.S. Patent No. 4,784,950; Palmiter et al.,
U.S. Patent No. 4,579,821; and Ringold, U.S. Patent No.
4,656,134. Preferred cultured mammalian cells include
the COS-1 (ATCC No. CRL 1650), COS-7 (ATCC No. CRL 1651),
BHK (ATCC No. CRL 1632), BHK 570 (ATCC No. CRL 10314) and
293 (ATCC No. CRL 1573; Graham et al., J. Gen. Virol.
36:59-72, 1977) cell lines. Additional suitable cell
lines are known in the art and available from public
depositories such as the American Type Culture
Collection, Rockville, Maryland.

Other higher eukaryotic cells can also be used
as hosts, including insect cells, plant cells and avian
cells. Transformation of insect cells and production of
foreign proteins therein is disclosed by Guarino et al.,
U.S. Patent No. 5,162,222 and Bang et al., U.S. Patent
No. 4,775,624. The use of Agrobacterium rhizogenes as a
vector for expressing genes in plant cells has been
reviewed by Sinkar et al., J. Biosci. (Bangalore)
11:47-58, 1987.

Prokaryotic host cells for use in carrying out
the present invention include strains of the bacterium
Escherichia coli; Bacillus and other genera are also
useful. Techniques for transforming these hosts and
expressing foreign DNA sequences cloned therein are well
known in the art (see, e.g., Sambrook et al., ibid.).
When expressing a Zsigl3 protein in bacteria such as E.
coli, the protein may be retained in the cytoplasm,
typically as insoluble granules, or may be directed to
the periplasmic space by a bacterial secretion sequence.
In the former case, the cells are lysed, and the granules
are recovered and denatured using, for example, guanidine
isothiocyanate or urea. The denatured protein can then
be refolded and dimerized by diluting the denaturant,
such as by dialysis against a solution of urea and a
combination of reduced and oxidized glutathione, followed
by dialysis against a buffered saline solution. In the
latter case, the protein can be recovered from the
periplasmic space in a soluble and functional form by
disrupting the cells (by, for example, sonication or
osmotic shock) to release the contents of the periplasmic
space and recovering the protein, thereby obviating the
need for denaturation and refolding.

The secretory peptide of Zsigl3 (residues -19
through -1 of SEQ ID NO:2) can be used to direct the
secretion of other proteins of interest from a host cell.
Such use is within the level of ordinary skill in the
art. Briefly, a DNA segment encoding the Zsigl3
secretory peptide is operably linked to a second DNA
segment encoding a protein of interest within a host cell
and the cell is cultured according to conventional
methods as summarized below. The protein of interest is
then recovered from the culture media.

Transformed or transfected host cells are
cultured according to conventional procedures in a
culture medium containing nutrients and other components
required for the growth of the chosen host cells. A
variety of suitable media, including defined media and
complex media, are known in the art and generally include
a carbon source, a nitrogen source, essential amino
acids, vitamins and minerals. Media may also contain
such components as growth factors or serum, as required.
The growth medium will generally select for cells
containing the exogenously added DNA by, for example,
drug selection or deficiency in an essential nutrient
which is complemented by the selectable marker carried on
the expression vector or co-transfected into the host
cell. P. methanolica cells are cultured in a medium
comprising adequate sources of carbon, nitrogen and trace
nutrients at a temperature of about 25°C to 35°C. Liquid
cultures are provided with sufficient aeration by
conventional means, such as shaking of small flasks or
sparging of fermentors. A preferred culture medium for
P. methanolica is YEPD.

Recombinant Zsigl3 polypeptides (including
chimeric polypeptides) can be purified from cells or cell
culture media using conventional fractionation and
purification methods and media. Ammonium sulfate
precipitation and acid or chaotrope extraction may be
used for fractionation of samples. Exemplary
purification steps include hydroxyapatite, size
exclusion, FPLC and reverse-phase high performance liquid
chromatography. Suitable anion exchange media include
derivatized dextrans, agarose, cellulose, polyacrylamide,
specialty silicas, and the like. Exemplary
chromatographic media include those media derivatized
with phenyl, butyl, or octyl groups, such as
Phenyl-Sepharose FF (Pharmacia), Toyopearl butyl 650
(Toso Haas, Montgomeryville, PA), Octyl-Sepharose
(Pharmacia) and the like; or polyacrylic resins, such as
Amberchrom CG 71 (Toso Haas) and the like. Suitable
solid supports include glass beads, silica-based resins,
cellulosic resins, agarose beads, cross-linked agarose
beads, polystyrene beads, cross-linked polyacrylamide
resins and the like that are insoluble under the
conditions in which they are to be used. These supports
can be modified with reactive groups that allow
attachment of proteins by amino groups, carboxyl groups,
sulfhydryl groups, hydroxyl groups and/or carbohydrate
moieties. Examples of coupling chemistries include
cyanogen bromide activation, N-hydroxysuccinimide
activation, epoxide activation, sulfhydryl activation,
hydrazide activation, and carboxyl and amino derivatives
for carbodiimide coupling chemistries. These and other
solid media are well known and widely used in the art,
and are available from commercial suppliers. Selection
of a particular method is a matter of routine design and
is determined in part by the properties of the chosen
support. See, for example, Affinity Chromatography:
Principles & Methods, Pharmacia LKB Biotechnology,
Uppsala, Sweden, 1988. Activated serine proteases are
preferably purified by binding to immobilized
p-aminobenzamidine (e.g., Benzamidine-Sepharose®;
Pharmacia) with subsequent elution using soluble
benzamidine (Winkler et al., Bio/Technology 2:990, 1985;
Mizuno et al., Biochem. Biophys. Res. Comm. 144:807,
1987).
Proteins comprising affinity tags or other
binding domains can be purified by exploiting the
properties of the additional domain. For example,
immobilized metal ion adsorption chromatography (IMAC)
can be used to purify histidine-rich proteins, including
proteins comprising poly-histidine tags. Briefly, a gel
is first charged with divalent metal ions to form a
chelate (Sulkowski, Trends in Biochem. 3:1-7, 1985).
Histidine-rich proteins will be adsorbed to this matrix
with differing affinities, depending upon the metal ion
used, and will be eluted by competitive elution, lowering
the pH, or use of strong chelating agents. Other methods
of purification include purification of glycosylated
proteins by lectin affinity chromatography and ion
exchange chromatography ("Guide to Protein Purification",
Methods Enzymol., Vol. 182, M. Deutscher, (ed.), Academic
Press, San Diego, 1990, pp. 529-39).

Zsigl3 polypeptides can also be prepared
through chemical synthesis. The polypeptides may be
glycosylated or non-glycosylated; pegylated or
non-pegylated; and may or may not include an initial
methionine amino acid residue.

When proteins are produced intracellularly
(such as in prokaryotic host cells) or by in vitro
synthesis, protein refolding (and optionally reoxidation)
procedures as generally disclosed above are
advantageously used.

It is preferred to purify Zsigl3 proteins to
>80% purity, more preferably to >90% purity, even more
preferably >95%, and particularly preferred is a
pharmaceutically pure state, that is greater than 99.9%
pure with respect to contaminating macromolecules,
particularly other proteins and nucleic acids, and free
of infectious and pyrogenic agents. Preferably, a
purified protein is substantially free of other proteins,
particularly other proteins of animal origin.

Proteins of the present invention can be used
within laboratory and industrial settings to cleave
proteins for a variety of purposes that will be evident
to those skilled in the art. The proteins can be used
alone to provide specific proteolysis or can be combined
with other proteases to provide a "cocktail" with a broad
spectrum of activity. Representative laboratory uses
include the removal of proteins from biological samples,
such as preparations of nucleic acids; and for digesting
proteins in conjunction with peptide mapping and
sequencing. Within industry, the proteins of the present
invention can be formulated in laundry detergents to aid
in the removal of protein stains, and can be used within
the large scale preparation of recombinant proteins to
specifically cleave fusion proteins, including removing
affinity tags. The proteins of the present invention can
be added to a variety of compositions and solutions as
proteolytically active enzymes or as protease precursors.
In the latter arrangement, the protein is subsequently
activated, such as by the addition of an activating
protease.

The proteins of the present invention are also
useful as research reagents to identify novel protease
inhibitors. Briefly, test samples (compounds, broths,
extracts, and the like) are added to protease assays as
disclosed above to determine their ability to inhibit
substrate cleavage. Inhibitors identified in this way
can be used in industry and research to reduce or prevent
undesired proteolysis. As with proteases, inhibitors can
be combined to increase the spectrum of activity.

Zsigl3 proteins and protein fragments can also
be used to prepare antibodies that specifically bind to
Zsigl3 proteins. As used herein, the term "antibodies"
includes polyclonal antibodies, monoclonal antibodies,
antigen-binding fragments thereof such as F(ab')2 and Fab
fragments, single chain antibodies, and the like,
including genetically engineered antibodies. Non-human
antibodies can be humanized by grafting non-human CDRs
onto human framework and constant regions, or by
incorporating the entire non-human variable domains
(optionally "cloaking" them with a human-like surface by
replacement of exposed residues, wherein the result is a
"veneered" antibody). In some instances, humanized
antibodies may retain non-human residues within the human
variable region framework domains to enhance proper
binding characteristics. Through humanizing antibodies,
biological half-life can be increased, and the potential
for adverse immune reactions upon administration to
humans is reduced. One skilled in the art can generate
humanized antibodies with specific and different constant
domains (i.e., different Ig subclasses) to facilitate or
inhibit various immune functions associated with
particular antibody constant domains. Alternative
techniques for generating or selecting antibodies useful
herein include in vitro exposure of lymphocytes to Zsigl3
protein, and selection of antibody display libraries in
phage or similar vectors (for instance, through use of
immobilized or labeled Zsigl3 protein). Antibodies are
defined to be specifically binding if they bind to a
Zsigl3 protein with an affinity at least 10-fold greater
than the binding affinity to control (non-Zsigl3)
protein. The affinity of a monoclonal antibody can be
readily determined by one of ordinary skill in the art
(see, for example, Scatchard, Ann. NY Acad. Sci.
51:660-672, 1949).
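
One way to apply the 10-fold criterion is to estimate an
association constant for each antigen by Scatchard analysis
and compare the two. The sketch below is an illustration
under stated assumptions: it treats binding as a single
class of independent sites, fits bound/free against bound by
ordinary least squares, and uses simulated data rather than
measurements from this disclosure.

```python
# Sketch: Scatchard estimate of affinity and the 10-fold specificity test.
# ASSUMPTIONS: one class of independent binding sites; data are simulated.

def scatchard_fit(bound, free):
    """Fit bound/free = Ka * (Bmax - bound); return (Ka, Bmax)."""
    if len(bound) != len(free) or len(bound) < 2:
        raise ValueError("need at least two paired measurements")
    ratios = [b / f for b, f in zip(bound, free)]
    n = len(bound)
    mean_x = sum(bound) / n
    mean_y = sum(ratios) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(bound, ratios))
    sxx = sum((x - mean_x) ** 2 for x in bound)
    slope = sxy / sxx                    # slope of the plot equals -Ka
    intercept = mean_y - slope * mean_x  # intercept equals Ka * Bmax
    ka = -slope
    return ka, intercept / ka


def is_specifically_binding(ka_target, ka_control, fold=10.0):
    """True if affinity for the target is at least `fold` times the control."""
    return ka_target >= fold * ka_control


def simulate_bound(ka, bmax, free_concentrations):
    """Ideal single-site binding, used only to generate example data."""
    return [bmax * ka * f / (1.0 + ka * f) for f in free_concentrations]


if __name__ == "__main__":
    free = [1e-9, 3e-9, 1e-8, 3e-8, 1e-7]          # molar, hypothetical
    target = simulate_bound(1e8, 1e-9, free)       # hypothetical Ka, Bmax
    control = simulate_bound(5e6, 1e-9, free)      # hypothetical control
    ka_target, _ = scatchard_fit(target, free)
    ka_control, _ = scatchard_fit(control, free)
    print(is_specifically_binding(ka_target, ka_control))
```

Real binding data are noisy and a Scatchard plot can be
curved, so a nonlinear fit of the binding isotherm may be
preferable in practice.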

Methods for preparing polyclonal and monoclonal
antibodies are well known in the art (see for example,
Hurrell, J. G. R., Ed., Monoclonal Hybridoma Antibodies:
Techniques and Applications, CRC Press, Inc., Boca Raton,
FL, 1982). As would be evident to one of ordinary skill
in the art, polyclonal antibodies can be generated from a
variety of warm-blooded animals such as horses, cows,
goats, sheep, dogs, chickens, rabbits, mice, and rats.
The immunogenicity of a Zsigl3 polypeptide can be
increased through the use of an adjuvant such as alum
(aluminum hydroxide) or Freund's complete or incomplete
adjuvant. Polypeptides useful for immunization also
include fusion polypeptides, such as fusions of a Zsigl3
protein or a portion thereof with an immunoglobulin
polypeptide or with maltose binding protein. The
polypeptide immunogen may be a full-length molecule or a
portion thereof. If the polypeptide portion is
"hapten-like", such portion may be advantageously joined
or linked to a macromolecular carrier (such as keyhole
limpet hemocyanin (KLH), bovine serum albumin (BSA) or
tetanus toxoid) for immunization.

A variety of assays known to those skilled in
the art can be utilized to detect antibodies which
specifically bind to Zsigl3 proteins. Exemplary assays
are described in detail in Antibodies: A Laboratory
Manual, Harlow and Lane (Eds.), Cold Spring Harbor
Laboratory Press, 1988. Representative examples of such
assays include: concurrent immunoelectrophoresis,
radio-immunoassays, radio-immunoprecipitations,
enzyme-linked immunosorbent assays (ELISA), dot blot
assays, Western blot assays, inhibition or competition
assays, and sandwich assays.

Antibodies to Zsigl3 proteins can be used for
affinity purification of the protein; within diagnostic
assays for determining circulating levels of the protein;
for detecting or quantitating soluble Zsigl3 protein or
protein fragments as a marker of underlying pathology or
disease; for immunolocalization within whole animals or
tissue sections, including immunodiagnostic applications;
for immunohistochemistry; and as antagonists to block
protein activity in vitro and in vivo. Antibodies to
Zsigl3 can also be used for tagging cells that express
Zsigl3; for affinity purification of Zsigl3 proteins; in
analytical methods employing FACS; for screening
expression libraries; and for generating anti-idiotypic
antibodies. For certain applications, including in vitro
and in vivo diagnostic uses, it is advantageous to employ
labeled antibodies. Suitable direct tags or labels
include radionuclides, enzymes, substrates, cofactors,
inhibitors, fluorescent markers, chemiluminescent
markers, magnetic particles and the like; indirect tags
or labels may feature use of biotin-avidin or other
complement/anti-complement pairs as intermediates.
Antibodies of the present invention can also be directly
or indirectly conjugated to drugs, toxins, radionuclides
and the like, and these conjugates used for in vivo
diagnostic or therapeutic applications.

While not wishing to be bound by theory, tissue
distribution of Zsigl3 mRNA suggests that the protein may
play a defensive role. Proteases that serve antibiotic
or antitoxin functions are known (Gabay, ibid.; Scocchi
et al., ibid.). Proteins of the present invention may
thus be useful as antibiotics and/or antitoxins. They
may further be used as diagnostic indicators of infection
by assaying body fluids for the presence of Zsigl3.
Zsigl3 proteins or fragments thereof can be detected
using, for example, immunoassay techniques employing
antibodies specific for Zsigl3 epitopes. Assays can be
performed using soluble or immobilized antibodies in a
variety of known formats.

A Zsigl3 gene, a probe comprising Zsigl3 DNA or
RNA, or a subsequence thereof can be used to determine if
the Zsigl3 gene is present on chromosome 11 or if a
mutation has occurred. Detectable chromosomal
aberrations at the Zsigl3 gene locus include, but are not
limited to, aneuploidy, gene copy number changes,
insertions, deletions, restriction site changes and
rearrangements. These aberrations can occur within the
coding sequence, within introns, or within flanking
sequences, including upstream promoter and regulatory
regions, and may be manifested as physical alterations
within a coding sequence or changes in gene expression
level. Analytical probes will generally be at least 20
nucleotides in length, although somewhat shorter probes
(14-17 nucleotides) can be used. PCR primers are at
least 5 nucleotides in length, preferably 15 or more nt,
more preferably 20-30 nt. Short polynucleotides can be
used when a small region of the gene is targeted for
analysis. For gross analysis of genes, a polynucleotide
probe may comprise an entire exon or more. Probes will
generally comprise a polynucleotide linked to a
signal-generating moiety such as a radionucleotide. In general,
gene-based diagnostic methods comprise the steps of (a)
obtaining a genetic sample from a patient; (b) incubating
the genetic sample with a polynucleotide probe or primer
as disclosed above, under conditions wherein the
polynucleotide will hybridize to complementary
polynucleotide sequence, to produce a first reaction
product; and (c) comparing the first reaction product
to a control reaction product. A difference between the
first reaction product and the control reaction product
is indicative of a genetic abnormality in the patient.
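
The length guidelines stated above for probes and primers can be written as a small check. The following is a minimal sketch only; the function name and the category labels are hypothetical and are not part of the disclosure. Probe lengths of 18-19 nucleotides are not addressed in the text and are reported as such.

```python
def classify_oligo(length_nt: int, role: str) -> str:
    """Classify an oligonucleotide length against the ranges stated in the text.

    role is "probe" or "primer". The returned labels are illustrative only.
    """
    if role == "probe":
        if length_nt >= 20:
            return "typical"              # "generally at least 20 nucleotides"
        if 14 <= length_nt <= 17:
            return "short but usable"     # "somewhat shorter probes (14-17)"
        return "outside stated ranges"    # includes 18-19 nt, not addressed
    if role == "primer":
        if length_nt < 5:
            return "too short"            # primers are "at least 5 nucleotides"
        if 20 <= length_nt <= 30:
            return "more preferred"       # "more preferably 20-30 nt"
        if length_nt >= 15:
            return "preferred"            # "preferably 15 or more nt"
        return "acceptable"               # 5-14 nt
    raise ValueError("role must be 'probe' or 'primer'")


if __name__ == "__main__":
    # The 40-nt oligonucleotide of Example 1 used as a probe.
    print(classify_oligo(40, "probe"))
    print(classify_oligo(25, "primer"))
```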

Genetic samples for use within the present invention
include genomic DNA, cDNA, and RNA. The polynucleotide
probe or primer can be RNA or DNA, and will comprise a
portion of SEQ ID NO:1 or SEQ ID NO:14, the complement of
SEQ ID NO:1 or SEQ ID NO:14, or an RNA equivalent
thereof. Suitable assay methods in this regard include
molecular genetic techniques known to those in the art,
such as restriction fragment length polymorphism (RFLP)
analysis, short tandem repeat (STR) analysis employing
PCR techniques, ligation chain reaction (Barany, PCR
Methods and Applications 1:5-16, 1991), ribonuclease
protection assays, and other genetic linkage analysis
techniques known in the art (Sambrook et al., ibid.;
Ausubel et al., ibid.; A.J. Marian, Chest 108:255-65,
1995). Ribonuclease protection assays (see, e.g.,
Ausubel et al., ibid., ch. 4) comprise the hybridization
of an RNA probe to a patient RNA sample, after which the
reaction product (RNA-RNA hybrid) is exposed to RNase.
Hybridized regions of the RNA are protected from
digestion. Within PCR assays, a patient genetic sample
is incubated with a pair of polynucleotide primers, and
the region between the primers is amplified and
recovered. Changes in size or amount of recovered
product are indicative of mutations in the patient.
Another PCR-based technique that can be employed is
single strand conformational polymorphism (SSCP) analysis
(Hayashi, PCR Methods and Applications 1:34-38, 1991).
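
The PCR comparison described above, in which a change in the size of the recovered product indicates a mutation, can be illustrated in silico. The sketch below assumes exact primer matches on a single strand; the sequences and primers are short hypothetical strings invented for illustration and are not taken from SEQ ID NO:1 or SEQ ID NO:14.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def amplicon(template: str, fwd_primer: str, rev_primer: str):
    """Return the region bounded by the two primers (primers included).

    Returns None if either primer lacks an exact match on the given strand.
    """
    start = template.find(fwd_primer)
    if start == -1:
        return None
    # The reverse primer anneals to the opposite strand, so its site on
    # this strand is its reverse complement.
    rev_site = reverse_complement(rev_primer)
    end = template.find(rev_site, start + len(fwd_primer))
    if end == -1:
        return None
    return template[start:end + len(rev_site)]


def size_shift(control: str, patient: str, fwd: str, rev: str):
    """Patient product length minus control product length, or None."""
    c = amplicon(control, fwd, rev)
    p = amplicon(patient, fwd, rev)
    if c is None or p is None:
        return None
    return len(p) - len(c)


if __name__ == "__main__":
    fwd = "ATGCA"                          # hypothetical sense primer
    rev = reverse_complement("GATTC")      # hypothetical antisense primer
    control = "GG" + "ATGCA" + "TTTTCCCC" + "GATTC" + "AA"
    patient = "GG" + "ATGCA" + "TTCCC" + "GATTC" + "AA"  # 3-nt deletion
    print(size_shift(control, patient, fwd, rev))
```

A nonzero shift corresponds to a size difference between the patient and control products. Changes in product amount, which the text also mentions, are not modeled here.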

Radiation hybrid mapping is a somatic cell
genetic technique developed for constructing
high-resolution, contiguous maps of mammalian chromosomes (Cox
et al., Science 250:245-250, 1990). Partial or full
knowledge of a gene's sequence allows one to design PCR
primers suitable for use with chromosomal radiation
hybrid mapping panels. Radiation hybrid mapping panels
that cover the entire human genome, such as the Stanford
G3 RH Panel and the GeneBridge 4 RH Panel (Research
Genetics, Inc., Huntsville, AL), are commercially
available. These panels enable rapid, PCR-based
chromosomal localizations and ordering of genes,
sequence-tagged sites (STSs), and other nonpolymorphic
and polymorphic markers within a region of interest.
This technique allows one to establish directly
proportional physical distances between newly discovered
genes of interest and previously mapped markers. The
precise knowledge of a gene's position can be useful for
a number of purposes, including: 1) determining
relationships between short sequences and obtaining
additional surrounding genetic sequences in various
forms, such as YACs, BACs or cDNA clones; 2) providing a
possible candidate gene for an inheritable disease which
shows linkage to the same chromosomal region; and 3)
cross-referencing model organisms, such as mouse, which
may aid in determining what function a particular gene
might have.
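
As an illustration of how distances between markers can be estimated from a radiation hybrid panel, the sketch below applies a commonly used two-point estimator. This is an assumption drawn from general radiation hybrid practice, not a method stated in this disclosure: the breakage frequency theta is estimated from the number of hybrids that retain exactly one of two markers, and the distance in centiRays is taken as -100 ln(1 - theta). Scores are assumed to be 1 (retained) or 0 (not retained); any other value is treated as unscored.

```python
import math


def two_point_rh(scores_a, scores_b):
    """Estimate breakage frequency and centiRay distance for two markers.

    scores_a and scores_b are equal-length sequences of per-hybrid scores.
    Returns (theta, distance_cR).
    """
    pairs = [(a, b) for a, b in zip(scores_a, scores_b)
             if a in (0, 1) and b in (0, 1)]
    n = len(pairs)
    if n == 0:
        raise ValueError("no hybrids scored for both markers")
    r_a = sum(a for a, _ in pairs) / n      # retention frequency of marker A
    r_b = sum(b for _, b in pairs) / n      # retention frequency of marker B
    discordant = sum(1 for a, b in pairs if a != b)
    denom = n * (r_a + r_b - 2 * r_a * r_b)
    if denom == 0:
        raise ValueError("markers retained in all or none of the hybrids")
    theta = discordant / denom
    if theta >= 1.0:
        return theta, math.inf              # markers behave as unlinked
    return theta, -100.0 * math.log(1.0 - theta)


if __name__ == "__main__":
    # Hypothetical scores for ten hybrids, for illustration only.
    marker_a = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]
    marker_b = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
    theta, cr = two_point_rh(marker_a, marker_b)
    print(f"theta = {theta:.3f}, distance = {cr:.1f} cR")
```

Mapping software typically uses multipoint likelihood methods rather than a single two-point estimate, so this shows only the underlying idea.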

The invention is further illustrated by the
following non-limiting examples.






Example 1 

Tissue distribution of Zsigl3 mRNA was analyzed
using Human Multiple Tissue Northern Blots (obtained from
Clontech, Inc., Palo Alto, CA). A 40-bp DNA probe (ZC
11,667; SEQ ID NO:11) was radioactively labeled with ³²P
using T4 polynucleotide kinase and forward reaction
buffer (GIBCO BRL, Gaithersburg, MD) according to the
supplier's specifications. The probe was purified using
a push column (Nuctrap™ column; Stratagene Cloning
Systems, La Jolla, CA). Prehybridization and
hybridization were carried out in a commercially
available solution (ExpressHyb™ hybridization solution;
Clontech Laboratories, Inc., Palo Alto, CA). Blots were
hybridized overnight at 42°C, washed in 2X SSC, 0.05% SDS
at room temperature, then in 1X SSC, 0.1% SDS at 60°C.
Two transcripts were observed: a strongly hybridizing
~1.8 kb band and a fainter band at approximately 4.0 kb.

An RNA Master Dot Blot (Clontech Laboratories)
that contained RNAs from various tissues that were
normalized to eight housekeeping genes was also probed
with the 40-bp oligonucleotide probe (SEQ ID NO:11). The
blot was prehybridized, then hybridized overnight with 10⁶
cpm/ml of probe at 42°C according to the manufacturer's
specifications. The blot was washed with 2X SSC, 0.05%
SDS at room temperature, then in 1X SSC, 0.1% SDS at
60°C. After a four-day exposure, signals were seen in
trachea, aorta, bladder, and fetal kidney.

Example 2 

Zsigl3 was mapped to chromosome 11 using the
commercially available GeneBridge 4 Radiation Hybrid
Panel (Research Genetics, Inc., Huntsville, AL). The
GeneBridge 4 Radiation Hybrid Panel contains PCRable DNAs
from each of 93 radiation hybrid clones, plus two control
DNAs (the HFL donor and the A23 recipient). A publicly
available WWW server
(http://www-genome.wi.mit.edu/cgi-bin/contig/rhmapper.pl)
allows mapping relative to the
Whitehead Institute/MIT Center for Genome Research
(WICGR) radiation hybrid map of the human genome, which
was constructed with the GeneBridge 4 Radiation Hybrid
Panel.

For the mapping of Zsigl3, 20 µl reaction
mixtures were set up in a PCRable 96-well microtiter
plate (Stratagene Cloning Systems, La Jolla, CA) and
incubated in a thermal cycler (RoboCycler™ Gradient 96;
Stratagene Cloning Systems). Each of the 95 PCR
reactions consisted of 2 µl 10X KlenTaq PCR reaction
buffer (Clontech Laboratories, Inc.), 1.6 µl dNTPs mix
(2.5 mM each, Perkin-Elmer, Foster City, CA), 1 µl sense
primer (ZC 13,508; SEQ ID NO:12), 1 µl antisense primer
(ZC 13,509; SEQ ID NO:13), 2 µl of a commercially
available density increasing agent and tracking dye
(RediLoad; Research Genetics, Inc., Huntsville, AL), 0.4
µl of polymerase/antibody mixture (50X Advantage™ KlenTaq
Polymerase Mix; Clontech Laboratories, Inc.), 25 ng of
DNA from an individual hybrid clone or control and ddH₂O
for a total volume of 20 µl. The reaction mixtures were
overlaid with an equal amount of mineral oil and sealed.
The PCR cycler conditions were as follows: an initial 5
minute denaturation at 95°C; 35 cycles of a 1 minute
denaturation at 95°C, 1 minute annealing at 62°C and 1.5
minute extension at 72°C; followed by a final extension of
7 minutes at 72°C. The reaction products were separated
by electrophoresis on a 3% NuSieve® GTG agarose gel (FMC
Bioproducts, Rockland, ME).
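
The reaction recipe and cycling program above reduce to simple arithmetic, sketched below. The per-reaction volumes and hold times are those given in the text. The DNA volume and the 10% pipetting overage are assumptions introduced for illustration, since the text specifies 25 ng of DNA without a volume and does not describe a master mix.

```python
# Per-reaction volumes (µl) of the fixed components, as stated in the text.
PER_REACTION_UL = {
    "10X KlenTaq PCR reaction buffer": 2.0,
    "dNTPs mix (2.5 mM each)": 1.6,
    "sense primer ZC 13,508": 1.0,
    "antisense primer ZC 13,509": 1.0,
    "RediLoad": 2.0,
    "50X Advantage KlenTaq Polymerase Mix": 0.4,
}
TOTAL_UL = 20.0


def water_volume(dna_volume_ul: float) -> float:
    """ddH2O needed to bring one reaction to 20 µl.

    dna_volume_ul is an assumed volume that delivers 25 ng of template.
    """
    return TOTAL_UL - sum(PER_REACTION_UL.values()) - dna_volume_ul


def master_mix(n_reactions: int, overage: float = 0.10) -> dict:
    """Scale the fixed components for n reactions plus an assumed overage."""
    scale = n_reactions * (1.0 + overage)
    return {name: round(vol * scale, 2) for name, vol in PER_REACTION_UL.items()}


def cycling_minutes() -> float:
    """Total programmed hold time, excluding temperature ramping."""
    initial_denaturation = 5.0
    per_cycle = 1.0 + 1.0 + 1.5    # denaturation, annealing, extension
    final_extension = 7.0
    return initial_denaturation + 35 * per_cycle + final_extension


if __name__ == "__main__":
    print(f"water per reaction: {water_volume(2.0):.1f} µl (assuming 2 µl DNA)")
    for name, vol in master_mix(95).items():
        print(f"{name}: {vol} µl")
    print(f"programmed time: {cycling_minutes()} min")
```

The fixed components total 8.0 µl per reaction, leaving 12.0 µl to be divided between the template DNA and water.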
The results showed that Zsigl3 maps 417.10
cR_3000 distal from the top of the human chromosome 11
linkage group on the WICGR radiation hybrid map.
Proximal and distal framework markers were D11S1979 and
D11S2384, respectively. The use of surrounding markers
positions Zsigl3 in the 11q22.1 region on the integrated
LDB chromosome 11 map (The Genetic Location Database,
University of Southampton, WWW server:
http://cedar.genetics.soton.ac.uk/public_html/). This
region of chromosome 11 is fairly rich in proteases.
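
A map position of this kind is derived from the pattern of PCR product presence and absence across the 93 hybrid clones. The sketch below shows one way such gel results might be reduced to a score vector before analysis. The 0/1/2 coding (no product, product, ambiguous) and the function names are assumptions for illustration; the input format expected by the mapping server is not given in the text.

```python
N_HYBRIDS = 93  # hybrid clones in the GeneBridge 4 panel, per the text


def encode_scores(calls) -> str:
    """Encode per-hybrid gel calls as a string of digits.

    calls: sequence of 93 entries, each True (product seen),
    False (no product), or None (ambiguous). The coding is assumed.
    """
    if len(calls) != N_HYBRIDS:
        raise ValueError(f"expected {N_HYBRIDS} calls, got {len(calls)}")
    code = {True: "1", False: "0", None: "2"}
    return "".join(code[c] for c in calls)


def retention_frequency(vector: str) -> float:
    """Fraction of unambiguously scored hybrids that retain the marker."""
    scored = [c for c in vector if c in "01"]
    if not scored:
        raise ValueError("no unambiguous scores")
    return scored.count("1") / len(scored)


if __name__ == "__main__":
    # Hypothetical calls, for illustration only.
    calls = [True] * 20 + [False] * 70 + [None] * 3
    vector = encode_scores(calls)
    print(vector)
    print(f"retention frequency: {retention_frequency(vector):.2f}")
```

The two control DNAs (donor and recipient) are not part of the vector; they serve to confirm that the primers amplify the human template and not the recipient background.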

From the foregoing, it will be appreciated
that, although specific embodiments of the invention have
been described herein for purposes of illustration,
various modifications may be made without deviating from
the spirit and scope of the invention. Accordingly, the
invention is not limited except as by the appended
claims.



